SVM based feature extraction in speech synthesis

نویسندگان

Peter Cahill

Jan Macek

Julie Carson-Berndsen

چکیده

Annotations of speech recordings are a fundamental part of any unit selection speech synthesiser. However, obtaining flawless annotations is an almost impossible task. Manual techniques can achieve the most accurate annotations, provided that enough time is available to analyse every phone individually. Automatic annotation techniques are a lot faster than manual, doing the task in a much more reasonable time frame, but such annotations contain a considerable amount of error. In this paper a technique is introduced that can quite accurately ensure a degree of articulatory-acoustic similarity between annotated units. The synthesiser will encourage the use of units that have been identified to have appropriate articulatory-acoustic parameters, but will not limit the domain of the speech database. This helps to identify where joins can be performed best and also identifies which annotations should be avoided at the phone level.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Synthetic Speech Spoofing Detection using MFCC and SVM

Nowadays synthetic voice is frequently used to defraud a biometric access system which are speaker recognition based. This paper presents synthetic speech detection in automatic speaker verification system (ASV) for the purpose of spoof detection. Feature extraction is done by canonical Mel Frequency Cepstral Coefficients (MFCC) algorithm and classification of natural and synthetic voice are do...

متن کامل

Speaker Recognition Using DWT- MFCC with Multi-SVM Classifier

This paper describes a hybrid technique for speaker recognition. Speaker recognition is that the method of identifying the person based on characteristics like pitch, tone, Cepstral coefficients in the speech wave. Here DWT and MFCC technique is employed for feature extraction. A mix of two or lot of techniques is named hybrid technique. DWT means divide the speech signal completely into differ...

متن کامل

تشخیص لهجه های زبان فارسی از روی سیگنال گفتار با استفاده از روش های استخراج ویژگی کارآمد و ترکیب طبقه بندها

Speech recognition has achieved great improvements recently. However, robustness is still one of the big problems, e.g. performance of recognition fluctuates sharply depending on the speaker, especially when the speaker has strong accent and difference Accents dramatically decrease the accuracy of an ASR system. In this paper we apply three new methods of feature extraction including Spectral C...

متن کامل

Feature extraction in opinion mining through Persian reviews

Opinion mining deals with an analysis of user reviews for extracting their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in such area. In general, opinion mining extracts user reviews at three levels of document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...

متن کامل

Comparative Study: HMM&SVM for Automatic Articulatory Feature Extraction

Generally speech recognition systems make use of acoustic features as a representation of speech for further processing. These acoustic features are usually based on human auditory perception or signal processing. More recently, Articulatory Feature (AF) based speech representations have been investigated by a number of speech technology researchers. Articulatory features are motivated by lingu...

متن کامل